
8 research outputs found

FM-index on GPU : a cooperative scheme to reduce memory footprint

Author: Chacón de San Baldomero Alejandro
Marco-Sola Santiago
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

The FM-index is a data structure that is seeing increasingly pervasive use, in particular in the field of high-throughput bioinformatics. Algorithms based on it show a pseudo-random memory access pattern. As a consequence, they are usually bound by memory bandwidth rather than CPU usage. Naive GPU implementations are no exception. Here we show that the combination of a compact design of the FM-index and a thread-cooperative approach can be used to restore a proper balance. The resulting solution is less memory-bandwidth intensive, and allows full exploitation of the computational resources of the GPU across several GPU architectures.

Diposit Digital de Documents de la UAB
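
To illustrate the memory access pattern the abstract above refers to, here is a minimal CPU sketch of FM-index backward search in Python. It is not the authors' GPU implementation: the function names `build_fm_index` and `count_occurrences` are hypothetical, the occurrence table is stored in full rather than in the compact form the paper describes, and the text is assumed to contain no character that sorts below the `$` terminator. Each pattern character triggers two table lookups at the positions `lo` and `hi`, which are effectively unrelated from one step to the next.

```python
def build_fm_index(text):
    """Build a toy FM-index: C table plus a full (unsampled) occurrence table."""
    text += "$"  # unique terminator, assumed smaller than every other character
    # Naive suffix array construction; fine for a sketch, too slow for genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c]: number of characters in the text that are strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ, len(bwt)


def count_occurrences(index, pattern):
    """Count exact matches of pattern using backward search."""
    C, occ, n = index
    lo, hi = 0, n  # half-open interval of suffix array rows
    for c in reversed(pattern):
        if c not in C:
            return 0
        # Two lookups per step, at positions that jump around the table.
        lo = C[c] + occ[c][lo]
        hi = C[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_fm_index("GATTACAGATTACA")
print(count_occurrences(index, "ATTA"))  # prints 2
```

A production index would sample the occurrence table and pack the BWT into a few bits per symbol; the sketch keeps everything uncompressed so that the two lookups per step stay visible.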

Thread-cooperative, bit-parallel computation of Levenshtein distance on GPU

Author: Chacón de San Baldomero Alejandro
Marco-Sola Santiago
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014

Approximate string matching is a very important problem in computational biology; it requires the fast computation of string distance as one of its essential components. Myers' bit-parallel algorithm improves the classical dynamic programming approach to Levenshtein distance computation, and offers competitive performance on CPUs. The main challenge when designing an efficient GPU implementation is to expose enough SIMD parallelism while at the same time keeping a relatively small working set for each thread. In this work we implement and optimise a CUDA version of Myers' algorithm suitable to be used as a building block for DNA sequence alignment. We achieve high efficiency by means of a cooperative parallelisation strategy for (1) very-long integer addition and shift operations, and (2) several simultaneous pattern matching tasks. In addition, we explore the performance impact obtained when using features specific to the Kepler architecture. Our results show an overall performance of the order of tera cell updates per second using a single high-end Nvidia GPU, and speedup factors in excess of 20 with respect to a sixteen-core, non-vectorised CPU implementation.

Diposit Digital de Documents de la UAB
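
As a rough illustration of the bit-parallel technique named in the abstract above, the following Python sketch computes the global Levenshtein distance with a Myers-style recurrence (in the global-distance formulation commonly attributed to Hyyrö). It is a single-threaded stand-in, not the CUDA code from the paper: the function name `myers_distance` is hypothetical, and Python's arbitrary-precision integers take the place of the very-long integer additions and shifts that the paper distributes across cooperating threads.

```python
def myers_distance(pattern, text):
    """Levenshtein distance via bit-parallel column updates (one bit per pattern row)."""
    m = len(pattern)
    if m == 0:
        return len(text)
    mask = (1 << m) - 1   # keeps every bit vector m bits wide
    high = 1 << (m - 1)   # bit corresponding to the last pattern row
    # peq[c]: bit i is set when pattern[i] == c.
    peq = {}
    for i, c in enumerate(pattern):
        peq[c] = peq.get(c, 0) | (1 << i)
    pv, mv = mask, 0      # vertical +1 / -1 deltas of the current column
    score = m             # D[m][0]: distance from the pattern to the empty text
    for c in text:
        eq = peq.get(c, 0)
        xv = eq | mv
        # The addition propagates carries along the column; this is the
        # long-integer operation that needs care when m exceeds a machine word.
        xh = (((eq & pv) + pv) ^ pv) | eq
        ph = mv | (~(xh | pv) & mask)   # horizontal +1 deltas
        mh = pv & xh                    # horizontal -1 deltas
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1
        # Shift in a 1: the top boundary row grows by one per text character.
        ph = ((ph << 1) | 1) & mask
        mh = (mh << 1) & mask
        pv = mh | (~(xv | ph) & mask)
        mv = ph & xv
    return score


print(myers_distance("kitten", "sitting"))  # prints 3
```

Each text character costs a constant number of word-wide operations regardless of how many pattern rows fit in the word, which is where the speedup over the cell-by-cell dynamic programming table comes from.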

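Optimització d'una aplicació bioinformàtica d'aliniament de seqüències executada en processadors many-core (GPUs) [Optimisation of a bioinformatics sequence alignment application running on many-core processors (GPUs)]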

Author: Chacón de San Baldomero Alejandro
Moure López Juan Carlos
Universitat Autònoma de Barcelona. Escola d'Enginyeria
Publication date: 01/01/2011

Genomic sequence analysis tools allow biologists to identify and understand key regions that are implicated in genetic diseases. There is currently a need to provide the scientific community with efficient analysis tools. This project characterises and analyses the performance of algorithms used to compare complete genomic sequences, executed on multi-core and many-core architectures. Based on this analysis, it evaluates the suitability of such architectures for the problem of comparing genomic sequences. Finally, it proposes a series of modifications to the implementations of these algorithms with the aim of improving performance.

Diposit Digital de Documents de la UAB

Boosting the FM-index on the GPU : effective techniques to mitigate random memory access

Author: Chacón de San Baldomero Alejandro
Espinosa Antonio
Marco-Sola Santiago
Moure López Juan Carlos
Ribeca Paolo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

The recent advent of high-throughput sequencing machines producing large amounts of short reads has boosted the interest in efficient string searching techniques. As of today, many mainstream sequence alignment software tools rely on a special data structure, called the FM-index, which allows for fast exact searches in large genomic references. However, such searches translate into a pseudo-random memory access pattern, thus making memory access the limiting factor of all computation-efficient implementations, both on CPUs and GPUs. Here we show that several strategies can be put in place to remove the memory bottleneck on the GPU: more compact indexes can be implemented by having more threads work cooperatively on larger memory blocks, and a k-step FM-index can be used to further reduce the number of memory accesses. The combination of those and other optimisations yields an implementation that is able to process about 2 Gbases of queries per second on our test platform, being about 8× faster than a comparable multi-core CPU version, and about 3× to 5× faster than the FM-index implementation on the GPU provided by the recently announced Nvidia NVBIO bioinformatics library.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Diposit Digital de Documents de la UAB

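Optimització d'una aplicació bioinformàtica d'aliniament de seqüències executada en processadors many-core (GPUs) [Optimisation of a bioinformatics sequence alignment application running on many-core processors (GPUs)]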

Author: Chacón de San Baldomero Alejandro
Publication date: 01/09/2011

Genomic sequence analysis tools allow biologists to identify and understand key regions that are implicated in genetic diseases. There is currently a need to provide the scientific community with efficient analysis tools. This project characterises and analyses the performance of algorithms used to compare complete genomic sequences, executed on multi-core and many-core architectures. Based on this analysis, it evaluates the suitability of such architectures for the problem of comparing genomic sequences. Finally, it proposes a series of modifications to the implementations of these algorithms with the aim of improving performance.

RECERCAT

Thread-cooperative, bit-parallel computation of Levenshtein distance on GPU

Author: Chacón de San Baldomero Alejandro
Marco-Sola Santiago
Publication date: 21/04/2021

RECERCAT

FM-index on GPU : a cooperative scheme to reduce memory footprint

Author: Chacón de San Baldomero Alejandro
Marco-Sola Santiago
Publication date: 21/04/2021

RECERCAT

Boosting the FM-index on the GPU : effective techniques to mitigate random memory access

Author: Chacón de San Baldomero Alejandro
Espinosa Morales Antonio Miguel
Marco-Sola Santiago
Moure Juan C
Ribeca Paolo
Publication date: 14/01/2021

RECERCAT